Abstract

We present a method for verifying the correctness of imperative programs which is based 
on the automated transformation of their specifications. Given a program prog, we consider
a partial correctness specification of the form {φ} prog {ψ}, where the assertions φ
and ψ are predicates defined by a set Spec of possibly recursive Horn clauses with linear
arithmetic (LA) constraints in their premise (also called constrained Horn clauses). The
verification method consists in constructing a set PC of constrained Horn clauses whose
satisfiability implies that {φ} prog {ψ} is valid. We highlight some limitations of
state-of-the-art constrained Horn clause solving methods, here called LA-solving methods, which
prove the satisfiability of the clauses by looking for linear arithmetic interpretations of 
the predicates. In particular, we prove that there exist some specifications that cannot be 
proved valid by any of those LA-solving methods. These specifications require the proof of 
satisfiability of a set PC of constrained Horn clauses that contain nonlinear clauses (that 
is, clauses with more than one atom in their premise). Then, we present a transformation, 
called linearization, that converts PC into a set of linear clauses (that is, clauses with 
at most one atom in their premise). We show that several specifications that could not
be proved valid by LA-solving methods can be proved valid after linearization. We also
present a strategy for performing linearization in an automatic way and we report on some 
experimental results obtained by using a preliminary implementation of our method. 

To appear in Theory and Practice of Logic Programming (TPLP), Proceedings of ICLP 2015.

KEYWORDS: Program verification, Partial correctness specifications, Horn clauses, Con¬

1 Introduction

One of the most established methodologies for specifying and proving the correctness
of imperative programs is based on the Floyd-Hoare axiomatic approach
(see (Hoare 1969), and also (Apt et al. 2009) for a recent presentation dealing with
both sequential and concurrent programs). By following this approach, the partial
correctness of a program prog is formalized by a triple {φ} prog {ψ}, also called
partial correctness specification, where the precondition φ and the postcondition ψ
are assertions in first order logic, meaning that if the input values of prog satisfy φ
and program execution terminates, then the output values satisfy ψ.


It is well-known that the problem of checking partial correctness of programs with 
respect to given preconditions and postconditions is undecidable. In particular, the 
undecidability of partial correctness is due to the fact that in order to prove in Hoare
logic the validity of a triple {φ} prog {ψ}, one has to look for suitable auxiliary
assertions, the so-called invariants, in an infinite space of formulas, and also to cope 
with the undecidability of logical consequence. 

Thus, the best way of addressing the problem of the automatic verification of 
programs is to design incomplete methods, that is, methods based on restrictions 
of first order logic, which work well in the practical cases of interest. To achieve 
this goal, some methods proposed in the literature in recent years use linear arithmetic
constraints as the assertion language and constrained Horn clauses as the
formalism to express and reason about program correctness (Bjørner et al. 2012;
De Angelis et al. 2014a; Grebenshchikov et al. 2012; Jaffar et al. 2012; Peralta et al. 1998;
Podelski and Rybalchenko 2007; Rümmer et al. 2013).


Constrained Horn clauses are clauses with at most one atom in their conclusion
and a conjunction of atoms and constraints over a given domain in their
premise. In this paper we will only consider constrained Horn clauses with linear
arithmetic constraints. The use of this formalism has the advantage that logical
consequence for linear arithmetic constraints is decidable and, moreover, reasoning
within constrained Horn clauses is supported by very effective automated tools,
such as Satisfiability Modulo Theories (SMT) solvers (de Moura and Bjørner 2008;
Cimatti et al. 2013; Rümmer et al. 2013) and constraint logic programming (CLP)
inference systems (Jaffar and Maher 1994). However, current approaches to correctness
proofs based on constrained Horn clauses have the disadvantage that they only
consider specifications whose preconditions and postconditions are linear arithmetic
constraints.


In this paper we overcome this limitation and propose an approach to proving 
general specifications of the form {φ} prog {ψ}, where φ and ψ are predicates defined
by a set of possibly recursive constrained Horn clauses (not simply linear arithmetic 
constraints), and prog is a program written in a C-like imperative language. 

First, we indicate how to construct a set PC of constrained Horn clauses (PC
stands for partial correctness), starting from: (i) the assertions φ and ψ, (ii) the
program prog, and (iii) the definition of the operational semantics of the language
in which prog is written, such that, if PC is satisfiable, then the partial correctness
specification {φ} prog {ψ} is valid.

Then, we formally show that there are sets PC of constrained Horn clauses
encoding partial correctness specifications, whose satisfiability cannot be proved by
current methods, here collectively called LA-solving methods (LA stands for linear
arithmetic). This limitation is due to the fact that LA-solving methods try to prove
satisfiability by interpreting the predicates as linear arithmetic constraints.

For these problematic specifications, the set PC of constrained Horn clauses contains
nonlinear clauses, that is, clauses with more than one atom in their premise.

Next, we present a transformation, which we call linearization, that converts
the set PC into a set of linear clauses, that is, clauses with at most one atom in
their premise. We show that linearization preserves satisfiability and also increases
the power of LA-solving, in the sense that several specifications that could not be
proved valid by LA-solving methods can be proved valid after linearization. Thus,
linearization followed by LA-solving is strictly more powerful than LA-solving alone.

The paper is organized as follows. In Section 2 we show how a class of partial
correctness specifications can be translated into constrained Horn clauses. In Section 3
we prove that LA-solving methods are inherently incomplete for proving the
satisfiability of constrained Horn clauses. In Section 4 we present a strategy for
automatically performing the linearization transformation, and we prove that it preserves
LA-solvability and that, in some cases, it is able to transform constrained Horn clauses
that are not LA-solvable into constrained Horn clauses that are LA-solvable. Finally,
in Section 5 we report on some preliminary experimental results obtained by
using a proof-of-concept implementation of the method.


2 Translating Partial Correctness into Constrained Horn Clauses 


We consider a C-like imperative programming language with integer variables, assignments,
conditionals, while loops, and goto’s. An imperative program is a sequence
of labeled commands (or commands, for short), and in each program there
is a unique halt command that, when executed, causes program termination.

The semantics of our language is defined by a transition relation, denoted =>,
between configurations. Each configuration is a pair ⟨⟨ℓ : c, δ⟩⟩ of a labeled command
ℓ : c and an environment δ. An environment δ is a function that maps every
integer variable identifier x to its value v in the integers Z. The definition
of the relation => is similar to that of the ‘small step’ operational semantics presented
in (Reynolds 1998), and is omitted. Given a program prog, we denote by ℓ_0 : c_0
its first labeled command.

We assume that all program executions are deterministic in the sense that,
for every environment δ_0, there exists a unique, maximal (possibly infinite) sequence
of configurations, called computation sequence, of the form: ⟨⟨ℓ_0 : c_0, δ_0⟩⟩ =>
⟨⟨ℓ_1 : c_1, δ_1⟩⟩ => ···. We also assume that every finite computation sequence ends
in the configuration ⟨⟨ℓ_h : halt, δ_n⟩⟩, for some environment δ_n. We say that a program
prog terminates for δ_0 iff the computation sequence starting from the initial
configuration ⟨⟨ℓ_0 : c_0, δ_0⟩⟩ is finite.

2.1 Specifying Program Correctness

First we need the following notions about constraints, constraint logic programming,
and constrained Horn clauses. For related notions with which the reader is not
familiar, we refer to (Jaffar and Maher 1994; Lloyd 1987).

A constraint is a linear arithmetic equality (=) or inequality (>) over the integers Z,
or a conjunction or a disjunction of constraints. For example, 2·X > 3·Y − 4
is a constraint. We feel free to say ‘linear arithmetic constraint’, instead of ‘constraint’.
We denote by C_LA the set of all constraints. An atom is an atomic formula
of the form p(t_1, ..., t_m), where p is a predicate symbol not in {=, >} and t_1, ..., t_m
are terms. Let Atom be the set of all atoms. A definite clause is an implication of the
form A ← c, G, where in the conclusion (or head) A is an atom, and in the premise
(or body) c is a constraint, and G is a (possibly empty) conjunction of atoms.
A constrained goal (or simply, a goal) is an implication of the form false ← c, G.
A constrained Horn clause (CHC) (or simply, a clause) is either a definite clause or
a constrained goal. A constraint logic program (or simply, a CLP program) is a set of
definite clauses. A clause over the integers is a clause that has no function symbols
except for integer constants, addition, and multiplication by integer constants.

The semantics of a constraint c is defined in terms of the usual interpretation,
denoted by LA, over the integers Z. We write LA ⊨ c to denote that c is true in LA.
Given a set S of constrained Horn clauses, an LA-interpretation is an interpretation
for the language of S that agrees with LA on the language of the constraints. An
LA-model of S is an LA-interpretation that makes all clauses of S true. A set of
constrained Horn clauses is satisfiable if it has an LA-model. A CLP program P is
always satisfiable and has a least LA-model, denoted M(P). We have that a set S
of constrained Horn clauses is satisfiable iff S = P ∪ G, where P is a CLP program,
G is a set of goals, and M(P) ⊨ G. Given a first order formula φ, we denote by
∃(φ) its existential closure and by ∀(φ) its universal closure.

Throughout the paper we will consider partial correctness specifications which
are particular triples of the form {φ} prog {ψ} defined as follows.

Definition 1 (Functional Horn Specification) 

A partial correctness triple {φ} prog {ψ} is said to be a functional Horn specification
if the following assumptions hold, where the predicates pre and f are assumed
to be defined by a CLP program Spec:

(1) φ is the formula: z_1 = p_1 ∧ ... ∧ z_s = p_s ∧ pre(p_1, ..., p_s), where z_1, ..., z_s are the
variables occurring in prog, and p_1, ..., p_s are variables (distinct from the z_i’s),
called parameters (informally, pre determines the initial values of the z_i’s);

(2) ψ is the atom f(p_1, ..., p_s, z_k), where z_k is a variable in {z_1, ..., z_s} (informally,
z_k is the variable whose final value is the result of the computation of prog);

(3) f is a relation which is total on pre and functional, in the sense that the following
two properties hold (informally, f is the function computed by prog):

(3.1) M(Spec) ⊨ ∀p_1, ..., p_s. pre(p_1, ..., p_s) → ∃y. f(p_1, ..., p_s, y)

(3.2) M(Spec) ⊨ ∀p_1, ..., p_s, y_1, y_2. f(p_1, ..., p_s, y_1) ∧ f(p_1, ..., p_s, y_2) → y_1 = y_2. □

We say that a functional Horn specification {φ} prog {ψ} is valid, or prog is partially
correct with respect to φ and ψ, iff for all environments δ_0 and δ_n,
if M(Spec) ⊨ pre(δ_0(z_1), ..., δ_0(z_s)) holds (in words, δ_0 satisfies pre) and ⟨⟨ℓ_0 : c_0, δ_0⟩⟩
=>* ⟨⟨ℓ_h : halt, δ_n⟩⟩ holds (in words, prog terminates for δ_0), then M(Spec) ⊨
f(δ_0(z_1), ..., δ_0(z_s), δ_n(z_k)) holds (in words, δ_n satisfies the postcondition).

The relation r_prog computed by prog according to the operational semantics of
the imperative language is defined by the CLP program OpSem made out of: (i) the
following clause R (where, as usual, variables are denoted by upper-case letters):

R. r_prog(P_1, ..., P_s, Z_k) ← initCf(C_0, P_1, ..., P_s), reach(C_0, C_h), finalCf(C_h, Z_k)

where:

(1.1) initCf(C_0, P_1, ..., P_s) represents the initial configuration C_0, where the variables
z_1, ..., z_s are bound to the values P_1, ..., P_s, respectively, and pre(P_1, ..., P_s) holds,

(1.2) reach(C_0, C_h) represents the transitive closure =>* of the transition relation
=>, which in turn is represented by a predicate tr(C_1, C_2) that encodes the
operational semantics, that is, the interpreter of our imperative language, by
relating a source configuration C_1 to a target configuration C_2,

(1.3) finalCf(C_h, Z_k) represents the final configuration C_h, where the variable z_k is
bound to the value Z_k,

and (ii) the clauses for the predicates pre(P_1, ..., P_s) and tr(C_1, C_2). The clauses
for the predicate tr(C_1, C_2) are defined as indicated in (De Angelis et al. 2014a),
and are omitted for reasons of space.

Example 1 (Fibonacci Numbers) 

Let us consider the following program fibonacci, that returns as value of u the n-th
Fibonacci number, for any n (≥ 0), having initialized u to 1 and v to 0.

0: while (n>0) { t=u; u=u+v; v=t; n=n-1 }        fibonacci
h: halt

The following is a functional Horn specification of the partial correctness of the
program fibonacci:

{n=N, N>=0, u=1, v=0, t=0} fibonacci {fib(N,u)}        (‡)

where N is a parameter and fib is defined by the following CLP program:

S1. fib(0,1).        Spec_fibonacci
S2. fib(1,1).
S3. fib(N3,F3) :- N1>=0, N2=N1+1, N3=N2+1, F3=F1+F2, fib(N1,F1), fib(N2,F2).

For reasons of conciseness, in the above specification (‡) we have slightly deviated
from Definition 1. In particular, we did not introduce the predicate symbol pre, and
in the precondition and postcondition we did not introduce the parameters which
have constant values.
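
As an informal sanity check of condition (3) of Definition 1 on this example, the following Python sketch evaluates the three fib clauses above bottom-up, restricted to first arguments up to a bound, and tests totality and functionality on that finite fragment. The names fib_facts and BOUND are ours and do not occur in the paper; a bounded check of this kind is only evidence, not a proof of (3.1) and (3.2).

```python
# Bounded bottom-up evaluation of the three fib clauses (illustrative sketch).
BOUND = 20  # hypothetical cut-off on the first argument of fib


def fib_facts(bound):
    """Return the set of pairs (N, F) derivable from the clauses with N <= bound."""
    facts = {(0, 1), (1, 1)}            # the two unit clauses
    changed = True
    while changed:                       # apply the recursive clause up to a fixpoint
        changed = False
        for (n1, f1) in list(facts):
            for (n2, f2) in list(facts):
                # body of the recursive clause: N1>=0, N2=N1+1, N3=N2+1, F3=F1+F2
                if n1 >= 0 and n2 == n1 + 1 and n2 + 1 <= bound:
                    new = (n2 + 1, f1 + f2)
                    if new not in facts:
                        facts.add(new)
                        changed = True
    return facts


facts = fib_facts(BOUND)
# Totality on the precondition N >= 0, within the bound.
total = all(any(n == k for (n, _) in facts) for k in range(BOUND + 1))
# Functionality: at most one F for each N.
functional = len({n for (n, _) in facts}) == len(facts)
print(total, functional)
```

On this fragment both checks succeed, which is consistent with fib being total on N>=0 and functional.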

The relation r_fibonacci computed by the program fibonacci according to the
operational semantics is defined by the following CLP program:

R1. r_fibonacci(N,U) :- initCf(C0,N), reach(C0,Ch), finalCf(Ch,U).        OpSem_fibonacci
R2. initCf(cf(LC,E),N) :- N>=0, U=1, V=0, T=0, firstCmd(LC),
        env((n,N),E), env((u,U),E), env((v,V),E), env((t,T),E).
R3. finalCf(cf(LC,E),U) :- haltCmd(LC), env((u,U),E).

where: (i) firstCmd(LC) holds for the command with label 0 of the program fibonacci;
(ii) env((x,X),E) holds iff in the environment E the variable x is bound to the
value of X; (iii) in the initial configuration C0 the environment E binds the variables
n, u, v, t to the values N (>=0), 1, 0, and 0, respectively; and (iv) haltCmd(LC)
holds for the labeled command h: halt. □
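
To make the roles of initCf, reach, and finalCf concrete, here is a minimal Python sketch of a small-step execution of the fibonacci program. It is not the interpreter tr of the paper (whose clauses are omitted there): as a simplifying assumption of ours, one transition executes the whole loop body, and a configuration is represented as a pair (label, environment dictionary). All function names are hypothetical.

```python
# Small-step execution of the fibonacci program (illustrative sketch).
# A configuration is a pair (label, env); labels 0 and "h" follow Example 1.

def init_cf(n):
    """Initial configuration: n=N with N>=0, u=1, v=0, t=0 (cf. initCf)."""
    assert n >= 0
    return (0, {"n": n, "u": 1, "v": 0, "t": 0})


def tr(cf):
    """One transition step; defined only for non-halt configurations."""
    label, env = cf
    assert label == 0
    if env["n"] > 0:                      # loop guard holds: execute the body
        e = dict(env)
        e["t"] = env["u"]                 # t=u
        e["u"] = env["u"] + env["v"]      # u=u+v
        e["v"] = e["t"]                   # v=t
        e["n"] = env["n"] - 1             # n=n-1
        return (0, e)
    return ("h", env)                     # loop guard fails: go to halt


def reach(cf):
    """Follow transitions until the halt command is reached (cf. reach)."""
    while cf[0] != "h":
        cf = tr(cf)
    return cf


def r_fibonacci(n):
    """Value of u in the final configuration (cf. finalCf)."""
    _, env = reach(init_cf(n))
    return env["u"]


print([r_fibonacci(n) for n in range(8)])  # [1, 1, 2, 3, 5, 8, 13, 21]
```

Since n decreases at every step, each computation sequence from an initial configuration is finite, in line with the determinism and termination assumptions of Section 2.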


2.2 Encoding Specifications into Constrained Horn Clauses 

In this section we present the encoding of the validity problem of functional Horn 
specifications into the satisfiability problem of CHC’s. 

For reasons of simplicity we assume that in Spec no predicate depends on f
(possibly, except for f itself), that is, Spec can be partitioned into two sets of clauses,
call them F_def and Aux, where F_def is the set of clauses with head predicate f, and f
does not occur in Aux.

Theorem 1 (Partial Correctness) 

Let F_pcorr be the set of goals derived from F_def as follows: for each clause D ∈ F_def
of the form f(X_1, ..., X_s, Y) ← B,

(1) every occurrence of f in D (and, in particular, in B) is replaced by r_prog, thereby
deriving a clause E of the form: r_prog(X_1, ..., X_s, Y) ← B,

(2) clause E is replaced by the goal G: false ← Y ≠ Z, r_prog(X_1, ..., X_s, Z), B,
where Z is a new variable, and

(3) goal G is replaced by the following two goals:

G_1. false ← Y>Z, r_prog(X_1, ..., X_s, Z), B
G_2. false ← Y<Z, r_prog(X_1, ..., X_s, Z), B

Let PC be the set F_pcorr ∪ Aux ∪ OpSem of CHC’s. We have that: if PC is satisfiable,
then {φ} prog {ψ} is valid. □

The proof of this theorem and of the other facts presented in this paper can be found
in the online appendix. In our Fibonacci example (see Example 1) the set F_def of
clauses is the entire set Spec_fibonacci and Aux = ∅. According to Points (1)-(3) of
Theorem 1, from Spec_fibonacci we derive the following six goals:

G1. false :- F>1, r_fibonacci(0,F).
G2. false :- F<1, r_fibonacci(0,F).
G3. false :- F>1, r_fibonacci(1,F).
G4. false :- F<1, r_fibonacci(1,F).
G5. false :- N1>=0, N2=N1+1, N3=N2+1, F3>F1+F2,
        r_fibonacci(N1,F1), r_fibonacci(N2,F2), r_fibonacci(N3,F3).
G6. false :- N1>=0, N2=N1+1, N3=N2+1, F3<F1+F2,
        r_fibonacci(N1,F1), r_fibonacci(N2,F2), r_fibonacci(N3,F3).

Thus, in order to prove the validity of the specification (‡) above, since Aux = ∅, it
is enough to show that the set PC_fibonacci = {G1, ..., G6} ∪ OpSem_fibonacci of CHC’s
is satisfiable.
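
The following Python sketch illustrates what satisfiability of the six goals means on a finite fragment: it computes the relation r_fibonacci for N up to a bound by directly running the loop of the program, and reports which goals have a body satisfied by tuples of a given relation. The bound, the function names, and the deliberately wrong tuple (2,3) are our own illustrative choices; a bounded check is evidence only, not a satisfiability proof.

```python
# Bounded check of the six goals against a candidate relation (illustrative sketch).
BOUND = 15  # hypothetical cut-off on N


def r_fibonacci(n):
    """Run the loop of the fibonacci program and return the final value of u."""
    u, v = 1, 0
    while n > 0:
        u, v = u + v, u     # t=u; u=u+v; v=t
        n -= 1
    return u


R = {(n, r_fibonacci(n)) for n in range(BOUND + 1)}


def violated_goals(rel):
    """Names of the goals whose body is satisfied by some tuples of rel."""
    bad = set()
    for (n, f) in rel:
        if n == 0 and f > 1:
            bad.add("G1")
        if n == 0 and f < 1:
            bad.add("G2")
        if n == 1 and f > 1:
            bad.add("G3")
        if n == 1 and f < 1:
            bad.add("G4")
    for (n1, f1) in rel:
        for (n2, f2) in rel:
            for (n3, f3) in rel:
                if n1 >= 0 and n2 == n1 + 1 and n3 == n2 + 1:
                    if f3 > f1 + f2:
                        bad.add("G5")
                    if f3 < f1 + f2:
                        bad.add("G6")
    return bad


print(violated_goals(R))                        # no goal body is satisfied
print(sorted(violated_goals(R | {(2, 3)})))     # a wrong extra tuple triggers G5 and G6
```

Note that the nonlinear goals G5 and G6 are the only ones that relate several r_fibonacci tuples at once, which is why they need the triple loop.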


3 A Limitation of LA-solving Methods

Now we show that there are sets of CHC’s that encode partial correctness specifications
whose satisfiability cannot be proved by LA-solving methods.

A symbolic interpretation is a function Σ : Atom → C_LA such that, for every
A ∈ Atom and substitution ϑ, Σ(Aϑ) = Σ(A)ϑ. Given a set S of CHC’s, a symbolic
interpretation Σ is an LA-solution of S iff, for every clause A_0 ← c, A_1, ..., A_n in
S, we have that LA ⊨ ∀((c ∧ Σ(A_1) ∧ ... ∧ Σ(A_n)) → Σ(A_0)).
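
To make the definition of LA-solution concrete, the Python sketch below tests candidate symbolic interpretations on a toy set of clauses of our own (it is not taken from the paper): p(X) ← X=0; p(Y) ← Y=X+1, p(X); false ← X<0, p(X), where the head false is read as the constraint false. The implications are only tested on a finite range of integers, which is an assumption of this sketch: an actual LA-solver would decide their validity in linear arithmetic.

```python
# Bounded test of candidate LA-solutions for a toy clause set (illustrative sketch).
RANGE = range(-50, 51)  # hypothetical finite test range


def counterexample(sigma):
    """Return (clause, X) for the first tested value falsifying a clause, else None."""
    for x in RANGE:
        if x == 0 and not sigma(x):          # p(X) <- X=0
            return ("fact", x)
        if sigma(x) and not sigma(x + 1):    # p(Y) <- Y=X+1, p(X)
            return ("step", x)
        if x < 0 and sigma(x):               # false <- X<0, p(X)
            return ("goal", x)
    return None


print(counterexample(lambda x: x >= 0))            # None: X>=0 passes all tests
print(counterexample(lambda x: x >= -1))           # ('goal', -1): too weak for the goal
print(counterexample(lambda x: 0 <= x <= 10))      # ('step', 10): not inductive
```

The three candidates show the two ways in which an interpretation can fail: it can be too weak to make a goal true, or it can fail to be closed under a definite clause.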

We say that a set S of CHC’s is LA-solvable if there exists an LA-solution of S.
Clearly, if a set of CHC’s is LA-solvable, then it is satisfiable. The converse does
not hold as we now show.

Theorem 2 

There are sets of constrained Horn clauses which are satisfiable and not LA-solvable.

Proof. Let PC_fibonacci be the set of clauses that encode the validity of the Fibonacci
specification (‡). PC_fibonacci is satisfiable, because r_fibonacci(N,F) holds iff F is
the N-th Fibonacci number, and hence the bodies of G1, ..., G6 are false. (This fact
will also be proved by the automatic method presented in Section 4.)

Now we prove, by contradiction, that PC_fibonacci is not LA-solvable. Suppose
that there exists an LA-solution Σ of PC_fibonacci. Let Σ(r_fibonacci(N,F)) be a
constraint c(N,F). To keep our proof simple, we assume that c(N,F) is defined by a
conjunction of linear arithmetic inequalities (that is, c(N,F) is a convex constraint),
but our argument can easily be generalized to any constraint in C_LA. By the definition
of LA-solution, we have that:

(P1) LA ⊨ ¬∃(N1≥0, N2=N1+1, N3=N2+1, F3>F1+F2, c(N1,F1), c(N2,F2), c(N3,F3))

(P2) M(OpSem_fibonacci) ⊨ ∀(r_fibonacci(N,F) → c(N,F))

Property (P1) follows from the fact that, in particular, an LA-solution satisfies
goal G5. Property (P2) follows from the fact that an LA-solution satisfies all clauses
of OpSem_fibonacci and M(OpSem_fibonacci) defines the least r_fibonacci relation
that satisfies those clauses.

From Property (P2) and from the fact that r_fibonacci(N,F) holds iff F is the
N-th Fibonacci number (and hence F is an exponential function of N), it follows that
c(N,F) is a conjunction of the form c_1(N,F), ..., c_k(N,F), where, for i = 1, ..., k,
with k>0, c_i(N,F) is either (A) N>a_i, for some integer a_i, or (B) F>a_i·N+b_i. (No
constraints of the form F<a_i·N+b_i are possible, as shown in Figure 1.)



Figure 1. The relation r_fibonacci(N,F) and the convex constraint c(N,F). 

By replacing c(N1,F1), c(N2,F2), and c(N3,F3) by the corresponding conjunctions
of atomic constraints of the forms (A) and (B), and eliminating the occurrences of
F1, F2, N2, and N3, from (P1) we get:

(P3) LA ⊨ ¬∃(N1≥0, F3>p_1, ..., F3>p_n)

where, for i = 1, ..., n, p_i is a linear polynomial in the variable N1. Then, the
constraint N1≥0, F3>p_1, ..., F3>p_n is satisfiable and Property (P3) is false. Thus,
the assumption that PC_fibonacci is LA-solvable is false, and we get the thesis. □
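
The core of the argument can be replayed on a concrete candidate. The Python sketch below takes a convex constraint c(N,F) given, as in the proof, by lower bounds of the forms (A) N>a and (B) F>a·N+b, and constructs values that satisfy the body of goal G5, so that the candidate cannot be an LA-solution. The particular candidate (N>-1, F>0, F>N-1) and all function names are our own illustrative assumptions.

```python
# Witness construction against a convex candidate c(N,F) (illustrative sketch).

def fib(n):
    """N-th Fibonacci number as computed by the fibonacci program."""
    u, v = 1, 0
    for _ in range(n):
        u, v = u + v, u
    return u


def c_holds(n, f, n_bounds, f_bounds):
    """c(N,F): N > a for each a in n_bounds, and F > a*N + b for each (a, b)."""
    return all(n > a for a in n_bounds) and all(f > a * n + b for (a, b) in f_bounds)


def least_f(n, f_bounds):
    """Least integer F satisfying all bounds of form (B) at N = n (0 if there are none)."""
    return max((a * n + b + 1 for (a, b) in f_bounds), default=0)


def g5_witness(n_bounds, f_bounds):
    """Values (N1, F1, N2, F2, N3, F3) satisfying the body of G5 under c."""
    n1 = max([0] + [a + 1 for a in n_bounds])   # N1 >= 0 and above every bound (A)
    n2, n3 = n1 + 1, n1 + 2
    f1 = least_f(n1, f_bounds)
    f2 = least_f(n2, f_bounds)
    # Only lower bounds on F exist, so F3 can always be pushed above F1 + F2.
    f3 = max(least_f(n3, f_bounds), f1 + f2 + 1)
    return (n1, f1, n2, f2, n3, f3)


n_bounds = [-1]                 # N > -1
f_bounds = [(0, 0), (1, -1)]    # F > 0 and F > N - 1
# The candidate over-approximates the Fibonacci relation on a sample, as (P2) requires.
assert all(c_holds(n, fib(n), n_bounds, f_bounds) for n in range(30))

n1, f1, n2, f2, n3, f3 = g5_witness(n_bounds, f_bounds)
ok = (n1 >= 0 and n2 == n1 + 1 and n3 == n2 + 1 and f3 > f1 + f2
      and c_holds(n1, f1, n_bounds, f_bounds)
      and c_holds(n2, f2, n_bounds, f_bounds)
      and c_holds(n3, f3, n_bounds, f_bounds))
print((n1, f1, n2, f2, n3, f3), ok)   # (0, 1, 1, 1, 2, 3) True
```

The construction never uses the specific candidate: it works for any finite list of bounds of the forms (A) and (B), which mirrors why no upper bound on F being available makes (P3) false.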

4 Increasing the Power of LA-Solving Methods by Linearization 

A weakness of the LA-solving methods is that they look for LA-solutions constructed
from single atoms, and by doing so they may fail to discover that a goal is
satisfiable because a conjunction of atoms in its premise is unsatisfiable, in spite of
the fact that each of its conjoint atoms is satisfiable. For instance, in our Fibonacci
example the premise of goal G5 contains three atoms with predicate r_fibonacci and
our proof of Section 3 shows that, even if the premise of G5 is unsatisfiable, there
is no constraint which is an LA-solution of the clauses defining r_fibonacci that,
when substituted for each r_fibonacci atom, makes that premise false. Thus, the
notion of LA-solution shows some weakness when dealing with nonlinear clauses,
that is, clauses whose premise contains more than one atom (besides constraints).

In this section we present an automatic transformation of constrained Horn 
clauses that has the objective of increasing the power of LA-solving methods. 

The core of the transformation, called linearization, takes a set of possibly nonlinear
constrained Horn clauses and transforms it into a set of linear clauses, that is,
clauses whose premise contains at most one atom (besides constraints). After performing
linearization, the LA-solving methods are able to exploit the interactions
among several atoms, instead of dealing with each atom individually. In particular,
an LA-solution of the linearized set of clauses will map a conjunction of atoms to a
constraint. We will show that linearization preserves the existence of LA-solutions
and, in some cases (including our Fibonacci example), transforms a set of clauses
which is not LA-solvable into a set of clauses that is LA-solvable.

Our transformation technique is made out of the following two steps: 

(1) RI: Removal of the interpreter , and (2) LIN: Linearization. 

These steps are performed by using the transformation rules for CLP programs presented
in (Etalle and Gabbrielli 1996), that is: unfolding (which consists in applying
a resolution step and a constraint satisfiability test), definition (which introduces a
new predicate defined in terms of old predicates), and folding (which redefines old
predicates in terms of new predicates introduced by the definition rule).


4.1 RI: Removal of the Interpreter

This step is a variant of the removal of the interpreter transformation presented
in (De Angelis et al. 2014a). In this step a specialized definition for r_prog is derived
by transforming the CLP program OpSem, thereby getting a new CLP program
OpSem_RI where there are no occurrences of the predicates initCf, finalCf, reach,
and tr, which as already mentioned encodes the interpreter of the imperative language
in which prog is written. (See online appendix for more details.)

By a simple extension of the results presented in (De Angelis et al. 2014a), it
can be shown that the RI transformation always terminates, preserves satisfiability,
and transforms OpSem into a set of linear clauses over the integers. It can also be
shown that the removal of the interpreter preserves LA-solvability. Thus, we have
the following result.

Theorem 3 

Let OpSem be a CLP program constructed starting from any given imperative
program prog. Then the RI transformation terminates and derives a CLP program
OpSem_RI such that:

(1) OpSem_RI is a set of linear clauses over the integers;

(2) OpSem ∪ Aux ∪ F_pcorr is satisfiable iff OpSem_RI ∪ Aux ∪ F_pcorr is satisfiable;

(3) OpSem ∪ Aux ∪ F_pcorr is LA-solvable iff OpSem_RI ∪ Aux ∪ F_pcorr is LA-solvable.


In the Fibonacci example, the input of the RI transformation is OpSem_fibonacci.
The output of the RI transformation consists of the following three clauses:

E1. r_fibonacci(N,F) :- N>=0, U=1, V=0, T=0, r(N,U,V,T,N1,F,V1,T1).
E2. r(N,U,V,T,N,U,V,T) :- N=<0.
E3. r(N,U,V,T,N2,U2,V2,T2) :- N>=1, N1=N-1, U1=U+V, V1=U, T1=U,
        r(N1,U1,V1,T1,N2,U2,V2,T2).

where r is a new predicate symbol introduced by the RI transformation. 
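
The three clauses produced by RI can be read as a loop. The following Python sketch is our own transcription (not part of the VeriMAP output): the function r returns the last four arguments of the predicate r given the first four, which is possible because the guards N=<0 and N>=1 of the two clauses for r are mutually exclusive.

```python
# Direct transcription of the clauses produced by RI (illustrative sketch).

def r(n, u, v, t):
    """Return (N2, U2, V2, T2) such that r(n, u, v, t, N2, U2, V2, T2) holds."""
    while n >= 1:                            # recursive clause: N>=1
        n, u, v, t = n - 1, u + v, u, u      # N1=N-1, U1=U+V, V1=U, T1=U
    return (n, u, v, t)                      # base clause: N=<0, state unchanged


def r_fibonacci(n):
    """F such that r_fibonacci(n, F) holds, or None when N>=0 fails."""
    if n < 0:
        return None
    _, f, _, _ = r(n, 1, 0, 0)               # U=1, V=0, T=0
    return f


print([r_fibonacci(n) for n in range(8)])    # [1, 1, 2, 3, 5, 8, 13, 21]
```

The interpreter predicates have disappeared: each recursive call of r corresponds to one iteration of the while loop of the program.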

As stated by Theorem 3, OpSem_RI is a set of clauses over the integers. Since the
clauses of the specification Spec define computable functions from Z^s to Z, without
loss of generality we may assume that also the clauses in Aux ∪ F_pcorr are over the
integers (Šebelík and Štěpánek 1982). From now on we will only deal with clauses
over the integers, and we will feel free to omit the qualification ‘over the integers’.


4.2 LIN: Linearization

The linearization transformation takes as input the set OpSem_RI ∪ Aux ∪ F_pcorr of
constrained Horn clauses and derives a new, equisatisfiable set TransfCls of linear
constrained Horn clauses.

In order to perform linearization, we assume that Aux is a set of linear clauses.
This assumption, which is not restrictive because any computable function on the
integers can be encoded by linear clauses (Šebelík and Štěpánek 1982), simplifies
the proof of termination of the transformation.

The linearization transformation is described in Figure 2. Its input is constructed
by partitioning OpSem_RI ∪ Aux ∪ F_pcorr into a set LCls of linear clauses and a set
NLGls of nonlinear goals. LCls consists of Aux, OpSem_RI (which, by Theorem 3, is
a set of linear clauses), and the subset of linear goals in F_pcorr. NLGls consists of
the set of nonlinear goals in F_pcorr.

When applying linearization we use the following transformation rule. 

Unfolding Rule. Let Cls be a set of constrained Horn clauses. Given a clause C of
the form H ← c, Ls, A, Rs, let us consider the set {K_i ← c_i, B_i | i = 1, ..., m} made
out of the (renamed apart) clauses of Cls such that, for i = 1, ..., m, A is unifiable
with K_i via the most general unifier ϑ_i and (c, c_i)ϑ_i is satisfiable. By unfolding C
with respect to A using Cls, we derive the set {(H ← c, c_i, Ls, B_i, Rs)ϑ_i | i =
1, ..., m} of clauses.

Input: (i) A set LCls of linear clauses, and (ii) a set Gls of nonlinear goals.
Output: A set TransfCls of linear clauses.

Initialization: NLCls := Gls; Defs := ∅; TransfCls := LCls;

while there is a clause C in NLCls do

  Unfolding: From clause C derive a set U(C) of clauses by unfolding C with respect to
    every atom occurring in its body using LCls;
    Rewrite each clause in U(C) to a clause of the form H ← c, A_1, ..., A_k, such that, for
    i = 1, ..., k, A_i is of the form p(X_1, ..., X_m);

  Definition & Folding:
    F(C) := U(C);
    for every clause E ∈ F(C) of the form H ← c, A_1, ..., A_k do
      if in Defs there is no clause of the form newp(X_1, ..., X_t) ← A_1, ..., A_k, where
        {X_1, ..., X_t} = vars(A_1, ..., A_k) ∩ vars(H, c)
      then add newp(X_1, ..., X_t) ← A_1, ..., A_k to Defs and to NLCls;
      F(C) := (F(C) − {E}) ∪ {H ← c, newp(X_1, ..., X_t)}
    end-for

  NLCls := NLCls − {C}; TransfCls := TransfCls ∪ F(C);
end-while


Figure 2. LIN: The linearization transformation. 

It is easy to see that, since LCls is a set of linear clauses, only a finite number
of new predicates can be introduced by any sequence of applications of Definition
& Folding, and hence the linearization transformation terminates. Moreover,
the use of the unfolding, definition, and folding rules according to the conditions
indicated in (Etalle and Gabbrielli 1996) guarantees the equivalence with respect
to the least LA-model, and hence the equisatisfiability of LCls ∪ Gls and TransfCls.
Thus, we have the following result.

Theorem 4 (Termination and Correctness of Linearization)

Let LCls be a set of linear clauses and Gls be a set of nonlinear goals. The linearization
transformation terminates for the input set of clauses LCls ∪ Gls, and
the output TransfCls is a set of linear clauses. Moreover, LCls ∪ Gls is satisfiable
iff TransfCls is satisfiable. □

Let us consider again the Fibonacci example. We apply the linearization transformation
to the set {E1,E2,E3} of linear clauses, and to the nonlinear goal G5. For
brevity, we do not consider the cases where the goals G1, ..., G4, G6 are taken as
input to the linearization transformation.

After Initialization we have that NLCls = {G5}, Defs = ∅, and TransfCls =
{E1,E2,E3}. By applying the Unfolding step to G5 we derive:

C1. false :- N1>=0, N2=N1+1, N3=N2+1, F3>F1+F2, U=1, V=0,
        r(N1,U,V,V,X1,F1,Y1,Z1), r(N2,U,V,V,X2,F2,Y2,Z2),
        r(N3,U,V,V,X3,F3,Y3,Z3).

Next, by Definition & Folding, the following clause is added to NLCls and Defs:

C2. new1(N1,U,V,F1,N2,F2,N3,F3) :- r(N1,U,V,V,X1,F1,Y1,Z1),
        r(N2,U,V,V,X2,F2,Y2,Z2), r(N3,U,V,V,X3,F3,Y3,Z3).

and clause C1 is folded using C2, thereby deriving the following linear clause:

C3. false :- N1>=0, N2=N1+1, N3=N2+1, F3>F1+F2, U=1, V=0,
        new1(N3,U,V,F3,N2,F2,N1,F1).

At the end of the first execution of the body of the while-do loop we have: NLCls =
{C2}, Defs = {C2}, and TransfCls = {E1,E2,E3,C3}. Now, the linearization transformation
continues by processing clause C2. During its execution, linearization
introduces two new predicates defined by the following two clauses:

C4. new2(N,U,V,F) :- r(N,U,V,V,X,F,Y,Z).
C5. new3(N2,U,V,F2,N1,F1) :- r(N1,U,V,V,X1,F1,Y1,Z1), r(N2,U,V,V,X2,F2,Y2,Z2).

The transformation terminates when all clauses derived by unfolding can be
folded using clauses in Defs, without introducing new predicates. The output of
the transformation is a set of linear clauses (listed in the online appendix) which is
LA-solvable, as reported on line 4 of Table 1 in the next section.
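
The intuition for why a predicate defined over a conjunction of r atoms may admit a linear invariant, while r_fibonacci alone does not, can be illustrated by running the three computations of goal G5 in lockstep. The Python sketch below is our own illustration under simplifying assumptions (the variable t is dropped, and the lockstep schedule is chosen by hand); it is not the set of linearized clauses, which is listed in the online appendix.

```python
# Lockstep run of the three computations occurring in goal G5 (illustrative sketch).

def step(state):
    """One application of the recursive clause for r to a state (n, u, v)."""
    n, u, v = state
    return (n - 1, u + v, u)


def product_run(n1):
    """Run r from (N, 1, 0) for N = n1, n1+1, n1+2 and return (F1, F2, F3)."""
    s1, s2, s3 = (n1, 1, 0), (n1 + 1, 1, 0), (n1 + 2, 1, 0)
    while s1[0] >= 1:                        # common phase: all three runs step together
        s1, s2, s3 = step(s1), step(s2), step(s3)
        # During the common phase the three runs agree on u and on v.
        assert s1[1] == s2[1] == s3[1] and s1[2] == s2[2] == s3[2]
    s2 = step(s2)                            # one extra step for the second run
    s3 = step(step(s3))                      # two extra steps for the third run
    return s1[1], s2[1], s3[1]


for n1 in range(20):
    f1, f2, f3 = product_run(n1)
    assert f3 == f1 + f2                     # a linear relation among the three results
print(product_run(3))                        # (3, 5, 8)
```

If the common phase ends with values u and v, then F1 = u, F2 = u+v, and F3 = 2u+v, so F3 = F1+F2 follows from linear equalities relating the three runs, even though each single run computes an exponential function of N.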

In general, there is no guarantee that we can automatically transform any given
satisfiable set of clauses into an LA-solvable one. In fact, such a transformation
cannot be algorithmic because, for constrained Horn clauses, the problem of satisfiability
is not semidecidable, while the problem of LA-solvability is semidecidable
(indeed, the set of symbolic interpretations is recursively enumerable and the problem
of checking whether or not a symbolic interpretation is an LA-solution is decidable).
However, the linearization transformation cannot decrease LA-solvability,
as the following theorem shows.

Theorem 5 (Monotonicity with respect to LA-Solvability) 

Assume that by applying the linearization transformation to a set LCls U Gls of 
CHC’s, we obtain a set TransfCls. If LCls U Gls is LA-solvable, then TransfCls is 
LA-solvable. □ 

Since there are cases where LCls ∪ Gls is not LA-solvable, while TransfCls is
LA-solvable (see the Fibonacci example above and some more examples in the
following section), as a consequence of Theorem 5 we get that the combination of
LA-solving and linearization is strictly more powerful than LA-solving alone.


5 Experimental Results 


We have implemented our verification method by using the VeriMAP system (De Angelis et al. 2014b).
The implemented tool consists of four modules, which we have depicted in Figure 3.
The first module, given the imperative program prog and its specification Spec, generates
the set PC of constrained Horn clauses (see Theorem 1). PC is then given
as input to the module RI that removes the interpreter. Then, the module LIN
performs the linearization transformation. Finally, the resulting linear clauses are
passed to the LA-solver, consisting of VeriMAP together with an SMT solver, which
is either Z3 (de Moura and Bjørner 2008) or MathSAT (Cimatti et al. 2013) or
Eldarica (Rümmer et al. 2013).


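[Figure 3, diagram: the inputs prog and Spec enter the Constraint Horn Clause
Generator, followed by the modules RI and LIN and by the LA-solver (VeriMAP with
Constraints Propagation, plus Z3 / MathSAT / Eldarica); the outputs are
true/false and unknown.]
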

Figure 3. Our software model checker that uses the linearization module LIN. 

We performed an experimental evaluation on a set of programs taken from the
literature, including some programs from (Felsing et al. 2014) obtained by applying
strength reduction, a real-world optimization technique¹. In Table 1 we report the
results of our experimental evaluation².

One can see that linearization takes very little time compared to the total
verification time. Moreover, linearization is necessary for the verification of 14
out of 19 programs (including fibonacci), which otherwise cannot be proved correct
with respect to their specifications. In the two columns under LA-solving-1 we
report the results obtained by giving as input to the Z3 and Eldarica solvers the
set PC generated by the RI module. Under LA-solving-1 we do not have a column
for MathSAT, because the version of this solver used in our experiments (namely,
MSATIC3) cannot deal with nonlinear CHC's, and therefore it cannot be applied
before linearization. In the last three columns of Table 1 we report the results
obtained by giving as input to VeriMAP (and the solvers Z3, MathSAT, and Eldarica,
respectively) the clauses obtained after linearization.

Unsurprisingly, for the verification problems where linearization is not necessary, 
our technique may deteriorate the performance, although in most of these problems 
the solving time does not increase much. 


                                   LA-solving-1            LA-solving-2: VeriMAP &
Program                      RI    Z3      Eldarica  LIN   Z3      MathSAT  Eldarica
------------------------------------------------------------------------------------
 1. binary-division          0.02  4.16    TO        0.04  17.36   17.87    20.98
 2. fast-multiplication-2    0.02  TO      3.71      0.01  1.07    1.97     7.59
 3. fast-multiplication-3    0.03  TO      4.56      0.02  2.59    2.54     9.31
 4. fibonacci                0.01  TO      TO        0.01  2.00    47.74    6.97
 5. Dijkstra-fusc            0.01  1.02    3.80      0.05  2.14    2.80     10.26
 6. greatest-common-divisor  0.01  TO      TO        0.01  0.89    1.78     0.04
 7. integer-division         0.01  TO      TO        0.01  0.88    1.90     2.86
 8. 91-function              0.01  1.27    TO        0.06  117.97  14.24    TO
 9. integer-multiplication   0.02  TO      TO        0.01  0.52    14.76    0.54
10. remainder                0.01  TO      TO        0.01  0.87    1.70     3.16
11. sum-first-integers       0.01  TO      TO        0.01  1.79    2.30     6.81
12. lucas                    0.01  TO      TO        0.01  2.04    8.39     9.46
13. padovan                  0.01  TO      TO        0.01  2.24    TO       11.62
14. perrin                   0.01  TO      TO        0.02  2.23    TO       11.89
15. hanoi                    0.01  TO      TO        0.01  1.81    2.07     6.59
16. digits10                 0.01  TO      TO        0.01  4.52    3.10     6.54
17. digits10-itmd            0.06  TO      TO        0.04  TO      10.26    12.38
18. digits10-opt             0.08  TO      TO        0.10  TO      TO       15.80
19. digits10-opt100          0.01  TO      TO        0.02  TO      58.99    8.98


Table 1. Columns RI and LIN show the times (in seconds) taken for removal of the 
interpreter and linearization. The two columns under LA-solving-1 show the times 
taken by Z3 and Eldarica for solving the problems after RI alone. The three columns 
under LA-solving-2 show the times taken by VeriMAP together with Z3, MathSAT, 
and Eldarica, after RI and LIN. The timeout TO occurs after 120 seconds. 
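
The count of programs for which linearization is necessary can be rechecked
directly from the entries of Table 1. The following sketch transcribes the
solving times from the table (with `None` standing for a timeout) and counts the
programs on which both solvers time out before linearization while at least one
solver succeeds afterwards. The names `rows` and `needs_linearization`, the
tuple layout, and the spelling of the program names are choices made for this
illustration.

```python
# Times in seconds transcribed from Table 1; None stands for a timeout (TO).
# Tuple layout: (Z3, Eldarica) before LIN, then (Z3, MathSAT, Eldarica) after LIN.
TO = None
rows = {
    "binary-division":         (4.16, TO,   17.36,  17.87, 20.98),
    "fast-multiplication-2":   (TO,   3.71, 1.07,   1.97,  7.59),
    "fast-multiplication-3":   (TO,   4.56, 2.59,   2.54,  9.31),
    "fibonacci":               (TO,   TO,   2.00,   47.74, 6.97),
    "Dijkstra-fusc":           (1.02, 3.80, 2.14,   2.80,  10.26),
    "greatest-common-divisor": (TO,   TO,   0.89,   1.78,  0.04),
    "integer-division":        (TO,   TO,   0.88,   1.90,  2.86),
    "91-function":             (1.27, TO,   117.97, 14.24, TO),
    "integer-multiplication":  (TO,   TO,   0.52,   14.76, 0.54),
    "remainder":               (TO,   TO,   0.87,   1.70,  3.16),
    "sum-first-integers":      (TO,   TO,   1.79,   2.30,  6.81),
    "lucas":                   (TO,   TO,   2.04,   8.39,  9.46),
    "padovan":                 (TO,   TO,   2.24,   TO,    11.62),
    "perrin":                  (TO,   TO,   2.23,   TO,    11.89),
    "hanoi":                   (TO,   TO,   1.81,   2.07,  6.59),
    "digits10":                (TO,   TO,   4.52,   3.10,  6.54),
    "digits10-itmd":           (TO,   TO,   TO,     10.26, 12.38),
    "digits10-opt":            (TO,   TO,   TO,     TO,    15.80),
    "digits10-opt100":         (TO,   TO,   TO,     58.99, 8.98),
}

def needs_linearization(row):
    """True if every solver times out before LIN and some solver succeeds after."""
    before, after = row[:2], row[2:]
    return all(t is TO for t in before) and any(t is not TO for t in after)

needed = [name for name, row in rows.items() if needs_linearization(row)]
print(len(needed), "of", len(rows))  # prints "14 of 19"
```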


1 https://www.facebook.com/notes/facebook-engineering/three-optimization-tips-for-c/10151361643253920

2 The VeriMAP tool, source code and specifications for the programs are available at: 

http://map.uniroma2.it/linearization 
6 Conclusions and Related Work


We have presented a method for proving partial correctness specifications of
programs, given as Hoare triples of the form $\{\varphi\}\ \mathit{prog}\ \{\psi\}$,
where the assertions $\varphi$ and $\psi$ are predicates defined by a set of possibly
recursive, definite CLP clauses. Our verification method is based on: Step (1) a
translation of a given specification into a set of constrained Horn clauses (that
is, a CLP program together with one or more goals), Step (2) an unfold/fold
transformation strategy, called linearization, which derives linear clauses (that
is, clauses with at most one atom in their body), and Step (3) an LA-solver that
attempts to prove the satisfiability of constrained Horn clauses by interpreting
predicates as linear arithmetic constraints.

We have formally proved that the method which uses linearization is strictly more
powerful than the method that applies Step (3) immediately after Step (1). We have
also developed a proof-of-concept implementation of our method by using the
VeriMAP verification system (De Angelis et al. 2014b) together with various
state-of-the-art solvers (namely, Z3 (de Moura and Bjørner 2008), MathSAT
(Cimatti et al. 2013), and Eldarica (Rümmer et al. 2013)), and we have shown that
our method works on several verification problems. Although these problems refer
to quite simple specifications, some of them cannot be solved by using the above
mentioned solvers alone.

The use of transformation-based methods in the field of program verification has
recently gained popularity (see, for instance, (Albert et al. 2007; De Angelis
et al. 2014a; Fioravanti et al. 2013; Kafle and Gallagher 2015; Leuschel and
Massart 2000; Lisitsa and Nemytykh 2008; Peralta et al. 1998)). However, fully
automated methods based on various notions of partial deduction and CLP program
specialization cannot achieve the same effect as linearization. Indeed,
linearization requires the introduction of new predicates corresponding to
conjunctions of old predicates, whereas partial deduction and program
specialization can only introduce new predicates that correspond to instances
of old predicates. In order to derive linear clauses, one could apply conjunctive
partial deduction (De Schreye et al. 1999), which essentially is equivalent to
unfold/fold transformation. However, to the best of our knowledge, this
application of conjunctive partial deduction to the field of program verification
has not been investigated so far.

The use of linear arithmetic constraints for program verification was first
proposed in the field of abstract interpretation (Cousot and Cousot 1977), where
these constraints are used for approximating the set of states that are
reachable during program execution (Cousot and Halbwachs 1978). In the field of
logic programming, abstract interpretation methods work similarly to LA-solving
for constrained Horn clauses, because they both look for interpretations of
predicates as linear arithmetic constraints that satisfy the program clauses (see,
for instance, (Benoy and King 1997)). Thus, abstract interpretation methods suffer
from the same theoretical limitations we have pointed out in this paper for
LA-solving methods.

One approach that has been followed for overcoming the limitations related
to the use of linear arithmetic constraints is to devise methods for generating
polynomial invariants and proving specifications with polynomial arithmetic
constraints (Rodríguez-Carbonell and Kapur 2007a; Rodríguez-Carbonell and Kapur
2007b). This approach also requires the development of solvers for polynomial
constraints, which is a very complex task on its own, as in general the
satisfiability of these constraints on the integers is undecidable (Matijasevic
1970). In contrast, the approach presented in this paper has the objective of
transforming problems which would require the proof of nonlinear arithmetic
assertions into problems which can be solved by using linear arithmetic
constraints. We have shown some examples (such as the fibonacci program) where
we are able to prove specifications whose post-condition is an exponential
function.

An interesting issue for future research is to identify general criteria to answer
the following question: Given a class $\mathcal{D}$ of constraints and a class
$\mathcal{H}$ of constrained Horn clauses, does the satisfiability of a finite set
of clauses in $\mathcal{H}$ imply its $\mathcal{D}$-solvability? Theorem 2 provides
a negative answer to this question when $\mathcal{D}$ is the class of LA
constraints and $\mathcal{H}$ is the class of all constrained Horn clauses.
Appendix


For the proof of Theorem 1 we need the following lemma.

Lemma 1. (i) The relation $r_{\mathit{prog}}$ defined by OpSem is a functional
relation, that is,

$$M(\mathit{OpSem}) \models \forall p_1,\ldots,p_s,y_1,y_2.\ r_{\mathit{prog}}(p_1,\ldots,p_s,y_1) \wedge r_{\mathit{prog}}(p_1,\ldots,p_s,y_2) \rightarrow y_1 = y_2.$$

(ii) A program prog terminates for an environment $\delta_0$ such that
$\delta_0(z_1) = p_1, \ldots, \delta_0(z_s) = p_s$ and $\mathit{pre}(p_1,\ldots,p_s)$
holds, iff

$$M(\mathit{OpSem}) \models \mathit{pre}(p_1,\ldots,p_s) \wedge \exists y.\ r_{\mathit{prog}}(p_1,\ldots,p_s,y).$$

Proof. Since the program prog is deterministic, the predicate $r_{\mathit{prog}}$
defined by OpSem is a functional relation (which might not be total on pre, as
prog might not terminate). Moreover, a program prog, with variables
$z_1,\ldots,z_s$, terminates for an environment $\delta_0$ such that:
(i) $\delta_0(z_1) = p_1, \ldots, \delta_0(z_s) = p_s$, and (ii) $\delta_0$ satisfies
pre, iff $\exists y.\ r_{\mathit{prog}}(p_1,\ldots,p_s,y)$ holds in
$M(\mathit{OpSem})$. □

Proof of Theorem 1 (Partial Correctness).

Let $\mathit{dom}_r(X_1,\ldots,X_s)$ be a predicate that represents the domain of the
functional relation $r_{\mathit{prog}}$. We assume that $\mathit{dom}_r(X_1,\ldots,X_s)$
is defined by a set Dom of clauses, using predicate symbols not in
$\mathit{OpSem} \cup \mathit{Spec}$, such that

$$M(\mathit{OpSem} \cup \mathit{Dom}) \models \forall X_1,\ldots,X_s.\ \big((\exists Y.\ r_{\mathit{prog}}(X_1,\ldots,X_s,Y)) \leftrightarrow \mathit{dom}_r(X_1,\ldots,X_s)\big) \qquad (1)$$

Let us denote by $\mathit{Spec}^{\sharp}$ the set of clauses obtained from Spec by
replacing each clause $f(X_1,\ldots,X_s,Y) \leftarrow B$ by the clause
$f(X_1,\ldots,X_s,Y) \leftarrow \mathit{dom}_r(X_1,\ldots,X_s), B$.
Then, for all integers $p_1,\ldots,p_s,y$,

$$M(\mathit{Spec}^{\sharp} \cup \mathit{Dom}) \models f(p_1,\ldots,p_s,y) \ \text{ implies } \ M(\mathit{Spec}) \models f(p_1,\ldots,p_s,y) \qquad (2)$$

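Moreover, let us denote by $\mathit{Spec}'$ the set of clauses obtained from
$\mathit{Spec}^{\sharp}$ by replacing all occurrences of $f$ by $r_{\mathit{prog}}$.
We show that $M(\mathit{OpSem} \cup \mathit{Aux} \cup \mathit{Dom}) \models \mathit{Spec}'$.

Let $S$ be any clause in $\mathit{Spec}'$. If $S$ belongs to Aux, then
$M(\mathit{OpSem} \cup \mathit{Aux}) \models S$. Otherwise, $S$ is of the form
$r_{\mathit{prog}}(X_1,\ldots,X_s,Y) \leftarrow \mathit{dom}_r(X_1,\ldots,X_s), B$ and,
by construction, in $F_{\mathit{pcorr}}$ there are two goals

$G_1$: $\mathit{false} \leftarrow Y > Z,\ r_{\mathit{prog}}(X_1,\ldots,X_s,Z),\ B$, and
$G_2$: $\mathit{false} \leftarrow Y < Z,\ r_{\mathit{prog}}(X_1,\ldots,X_s,Z),\ B$

such that $\mathit{OpSem} \cup \mathit{Aux} \cup \{G_1, G_2\}$ is satisfiable. Then,

$$M(\mathit{OpSem} \cup \mathit{Aux}) \models \neg\exists\,(Y \neq Z \wedge r_{\mathit{prog}}(X_1,\ldots,X_s,Z) \wedge B)$$

Since $M(\mathit{OpSem} \cup \mathit{Dom}) \models r_{\mathit{prog}}(X_1,\ldots,X_s,Z) \rightarrow \mathit{dom}_r(X_1,\ldots,X_s)$,
we also have that

$$M(\mathit{OpSem} \cup \mathit{Aux} \cup \mathit{Dom}) \models \neg\exists\,(Y \neq Z \wedge \mathit{dom}_r(X_1,\ldots,X_s) \wedge r_{\mathit{prog}}(X_1,\ldots,X_s,Z) \wedge B)$$

From the functionality of $r_{\mathit{prog}}$ it follows that

$$M(\mathit{OpSem} \cup \mathit{Aux} \cup \mathit{Dom}) \models \neg r_{\mathit{prog}}(X_1,\ldots,X_s,Y) \leftrightarrow \big(\neg\exists Z.\ r_{\mathit{prog}}(X_1,\ldots,X_s,Z) \ \vee\ (r_{\mathit{prog}}(X_1,\ldots,X_s,Z) \wedge Y \neq Z)\big)$$

and hence, by using (1),

$$M(\mathit{OpSem} \cup \mathit{Aux} \cup \mathit{Dom}) \models \neg\exists\,(\mathit{dom}_r(X_1,\ldots,X_s) \wedge \neg r_{\mathit{prog}}(X_1,\ldots,X_s,Y) \wedge B)$$

Thus, we have that

$$M(\mathit{OpSem} \cup \mathit{Aux} \cup \mathit{Dom}) \models \forall\,(\mathit{dom}_r(X_1,\ldots,X_s) \wedge B \rightarrow r_{\mathit{prog}}(X_1,\ldots,X_s,Y))$$

that is, clause $S$ is true in $M(\mathit{OpSem} \cup \mathit{Aux} \cup \mathit{Dom})$.
We can conclude that $M(\mathit{OpSem} \cup \mathit{Aux} \cup \mathit{Dom})$ is a model
of $\mathit{Spec}' \cup \mathit{Dom}$, and since $M(\mathit{Spec}' \cup \mathit{Dom})$
is the least model of $\mathit{Spec}' \cup \mathit{Dom}$, we have that

$$M(\mathit{Spec}' \cup \mathit{Dom}) \subseteq M(\mathit{OpSem} \cup \mathit{Aux} \cup \mathit{Dom}) \qquad (3)$$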

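Next we show that, for all integers $p_1,\ldots,p_s,y$,

$$M(\mathit{Spec}^{\sharp} \cup \mathit{Dom}) \models f(p_1,\ldots,p_s,y) \ \text{ iff } \ M(\mathit{OpSem}) \models r_{\mathit{prog}}(p_1,\ldots,p_s,y) \qquad (4)$$

Only If Part of (4). Suppose that
$M(\mathit{Spec}^{\sharp} \cup \mathit{Dom}) \models f(p_1,\ldots,p_s,y)$. Then, by
construction,

$$M(\mathit{Spec}' \cup \mathit{Dom}) \models r_{\mathit{prog}}(p_1,\ldots,p_s,y)$$

and hence, by (3),

$$M(\mathit{OpSem} \cup \mathit{Aux} \cup \mathit{Dom}) \models r_{\mathit{prog}}(p_1,\ldots,p_s,y)$$

Since $r_{\mathit{prog}}$ does not depend on predicates in $\mathit{Aux} \cup \mathit{Dom}$,

$$M(\mathit{OpSem}) \models r_{\mathit{prog}}(p_1,\ldots,p_s,y)$$

If Part of (4). Suppose that $M(\mathit{OpSem}) \models r_{\mathit{prog}}(p_1,\ldots,p_s,y)$.
Then, by definition of $r_{\mathit{prog}}$,

$$M(\mathit{Dom}) \models \mathit{dom}_r(p_1,\ldots,p_s) \qquad (5)$$

and

$$M(\mathit{Spec}) \models \mathit{pre}(p_1,\ldots,p_s) \qquad (6)$$

Thus, by (6) and Condition (3.1) of Definition 1, there exists $z$ such that

$$M(\mathit{Spec}) \models f(p_1,\ldots,p_s,z) \qquad (7)$$

By (5) and (7),

$$M(\mathit{Spec}^{\sharp} \cup \mathit{Dom}) \models f(p_1,\ldots,p_s,z) \qquad (8)$$

By the Only If Part of (4),

$$M(\mathit{OpSem}) \models r_{\mathit{prog}}(p_1,\ldots,p_s,z)$$

and by the functionality of $r_{\mathit{prog}}$, $z = y$. Hence, by (8),

$$M(\mathit{Spec}^{\sharp} \cup \mathit{Dom}) \models f(p_1,\ldots,p_s,y)$$

Let us now prove partial correctness. If $M(\mathit{Spec}) \models \mathit{pre}(p_1,\ldots,p_s)$
and prog terminates, that is, $M(\mathit{Dom}) \models \mathit{dom}_r(p_1,\ldots,p_s)$,
then for some integer $y$, $M(\mathit{OpSem}) \models r_{\mathit{prog}}(p_1,\ldots,p_s,y)$.
Thus, by (4), $M(\mathit{Spec}^{\sharp} \cup \mathit{Dom}) \models f(p_1,\ldots,p_s,y)$
and hence, by (2), $M(\mathit{Spec}) \models f(p_1,\ldots,p_s,y)$. Suppose that the
postcondition $\psi$ is $f(p_1,\ldots,p_s,z_k)$. Then, by Condition (3.2) of
Definition 1, $y = z_k$.

Thus, $\{\varphi\}\ \mathit{prog}\ \{\psi\}$. □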

Removal of the Interpreter 

Here we report the variant of the transformation presented in (De Angelis et al.
2014a) that we use in this paper to perform the removal of the interpreter. In
this transformation we use the function Unf(C, A, Cls) defined as the set of
clauses derived by unfolding a clause C with respect to an atom A using the set
Cls of clauses (see the unfolding rule in Section 4.2).
The predicate reach is defined as follows:

$\mathit{reach}(X, X) \leftarrow$

$\mathit{reach}(X, Z) \leftarrow \mathit{tr}(X, Y),\ \mathit{reach}(Y, Z)$

where, as mentioned in Section 2, tr is a (nonrecursive) predicate representing one
transition step according to the operational semantics of the imperative language.
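
The two clauses for reach define the reflexive and transitive closure of tr. As
an illustration only, the following sketch computes their least model over a
finite transition relation given as a set of pairs. In our setting
configurations range over an infinite domain, so the finite `states` set, the
explicit `tr` pairs and the function name `reach_model` are assumptions of this
example and not part of the RI transformation.

```python
def reach_model(states, tr):
    """Least model of:  reach(X,X).   reach(X,Z) :- tr(X,Y), reach(Y,Z).

    states: finite set of configurations; tr: set of (X, Y) one-step transitions.
    """
    reach = {(x, x) for x in states}          # first clause: reach(X, X)
    changed = True
    while changed:                            # iterate until a fixpoint is reached
        changed = False
        for (x, y) in tr:
            for (y2, z) in list(reach):
                # second clause: tr(X, Y) and reach(Y, Z) give reach(X, Z)
                if y2 == y and (x, z) not in reach:
                    reach.add((x, z))
                    changed = True
    return reach

# Three configurations in a chain: 0 -> 1 -> 2.
model = reach_model({0, 1, 2}, {(0, 1), (1, 2)})
print(sorted(model))
```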

In order to perform the Unfolding step, we assume that the atoms occurring in
bodies of clauses are annotated as either unfoldable or not unfoldable. This
annotation ensures that any sequence of clauses constructed by unfolding w.r.t.
unfoldable atoms is finite. In particular, the atoms with predicate initCf,
finalCf, and tr are unfoldable. The atoms of the form
$\mathit{reach}(\mathit{cf}_1, \mathit{cf}_2)$ are unfoldable if $\mathit{cf}_1$ is not
associated with a while or goto command. Other annotations based on a different
analysis of program OpSem can be used.


Input: Program OpSem.

Output: Program $\mathit{OpSem}_{RI}$ such that, for all integers $p_1,\ldots,p_s,z_k$,
$r_{\mathit{prog}}(p_1,\ldots,p_s,z_k) \in M(\mathit{OpSem})$ iff
$r_{\mathit{prog}}(p_1,\ldots,p_s,z_k) \in M(\mathit{OpSem}_{RI})$.


Initialization:

  $\mathit{OpSem}_{RI} := \emptyset$; $\mathit{Defs} := \emptyset$;

  $\mathit{InCls} := \{ r_{\mathit{prog}}(P_1,\ldots,P_s,Z_k) \leftarrow \mathit{initCf}(C_0,P_1,\ldots,P_s),\ \mathit{reach}(C_0,C_h),\ \mathit{finalCf}(C_h,Z_k) \}$;

while in InCls there is a clause C which is not a constrained fact do

  Unfolding:

    $\mathit{SpC} := \mathit{Unf}(C, A, \mathit{OpSem})$, where A is the leftmost atom in the body of C;
    while in SpC there is a clause D whose body contains an occurrence of an
    unfoldable atom A do
      $\mathit{SpC} := (\mathit{SpC} - \{D\}) \cup \mathit{Unf}(D, A, \mathit{OpSem})$
    end-while;

  Definition & Folding:

    while in SpC there is a clause E of the form: $H \leftarrow e,\ \mathit{reach}(\mathit{cf}_1, \mathit{cf}_2)$
    do
      if in Defs there is no clause of the form: $\mathit{newp}(V) \leftarrow \mathit{reach}(\mathit{cf}_1, \mathit{cf}_2)$,
         where V is the set of variables occurring in $\mathit{reach}(\mathit{cf}_1, \mathit{cf}_2)$,
      then add the clause N: $\mathit{newp}(V) \leftarrow \mathit{reach}(\mathit{cf}_1, \mathit{cf}_2)$ to Defs and InCls;
      $\mathit{SpC} := (\mathit{SpC} - \{E\}) \cup \{H \leftarrow e,\ \mathit{newp}(V)\}$
    end-while;

  $\mathit{InCls} := \mathit{InCls} - \{C\}$; $\mathit{OpSem}_{RI} := \mathit{OpSem}_{RI} \cup \mathit{SpC}$;
end-while;


RI: Removal of the Interpreter.

Let us now prove Theorem 3 stating the relevant properties of the RI
transformation.

The RI transformation terminates. The termination of the Unfolding step is
guaranteed by the unfoldable annotations. Indeed, (i) the repeated unfolding of
the unfoldable atoms with predicates initCf, finalCf, and tr always terminates
because those atoms have no recursive clauses, and (ii) by the definition of the
semantics of the imperative program, the repeated unfolding of an atom of the form
$\mathit{reach}(\mathit{cf}_1, \mathit{cf}_2)$ eventually derives a new
$\mathit{reach}(\mathit{cf}_3, \mathit{cf}_4)$ atom where $\mathit{cf}_3$ is either a
final configuration or a configuration associated with a while or goto command, and
in both cases unfolding terminates. The termination of the Definition & Folding
step follows from the fact that SpC is a finite set of clauses.

The outer while loop terminates because only a finite set of new predicate
definitions of the form $\mathit{newp}(V) \leftarrow \mathit{reach}(\mathit{cf}_1, \mathit{cf}_2)$
can be introduced. Indeed, each configuration cf is represented as a term
cf(LC, E), where LC is a labeled command and E is an environment (see Example 1).
An environment is represented as a list of (v, X) pairs where v is a variable
identifier and X is its value, that is, a logical variable whose value may be
subject to a given constraint. Considering that: (i) the labeled commands and the
variable identifiers occurring in an imperative program are finitely many, and
(ii) predicate definitions of the form
$\mathit{newp}(V) \leftarrow \mathit{reach}(\mathit{cf}_1, \mathit{cf}_2)$ abstract away
from the constraints that hold on the logical variables occurring in
$\mathit{cf}_1$ and $\mathit{cf}_2$, we can conclude that there are only finitely many
such clauses (modulo variable renaming).

Point 1: $\mathit{OpSem}_{RI}$ is a set of linear clauses over the integers. By
construction, every clause in $\mathit{OpSem}_{RI}$ is of the form
$H \leftarrow c, B$, where (i) $H$ is either $r_{\mathit{prog}}(P_1,\ldots,P_s,Z_k)$
or $\mathit{newp}(V)$, for some new predicate newp and tuple of variables $V$, and
(ii) $B$ is either absent or of the form $\mathit{newp}(V)$, for some new predicate
newp and tuple of variables $V$. Thus, every clause is a linear clause over the
integers.

Point 2: $\mathit{OpSem} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$ is satisfiable iff
$\mathit{OpSem}_{RI} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$ is satisfiable.
From the correctness of the unfolding, definition, and folding rules with respect to
the least model semantics of CLP programs (Etalle and Gabbrielli 1996), it follows
that, for all integers $p_1,\ldots,p_s,z_k$,

$$r_{\mathit{prog}}(p_1,\ldots,p_s,z_k) \in M(\mathit{OpSem}) \ \text{ iff } \ r_{\mathit{prog}}(p_1,\ldots,p_s,z_k) \in M(\mathit{OpSem}_{RI}) \qquad (\dagger 1)$$

$\mathit{OpSem} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$ is satisfiable iff for every
ground instance $G$ of a goal in $F_{\mathit{pcorr}}$,
$M(\mathit{OpSem} \cup \mathit{Aux}) \models G$. Since the only predicate of OpSem on
which $G$ may depend is $r_{\mathit{prog}}$, by $(\dagger 1)$, we have that
$M(\mathit{OpSem} \cup \mathit{Aux}) \models G$ iff
$M(\mathit{OpSem}_{RI} \cup \mathit{Aux}) \models G$. Finally,
$M(\mathit{OpSem}_{RI} \cup \mathit{Aux}) \models G$ for every ground instance $G$ of a
goal in $F_{\mathit{pcorr}}$, iff
$\mathit{OpSem}_{RI} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$ is satisfiable.

Point 3: $\mathit{OpSem} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$ is LA-solvable iff
$\mathit{OpSem}_{RI} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$ is LA-solvable.

Suppose that $\mathit{OpSem} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$ is LA-solvable,
and let $\Sigma$ be an LA-solution of
$\mathit{OpSem} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$. Now we construct an
LA-solution $\Sigma_{RI}$ of
$\mathit{OpSem}_{RI} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$. To this purpose it is
enough to define a symbolic interpretation for the new predicates introduced by RI.

For any predicate newp introduced by RI via a clause of the form:

$$\mathit{newp}(V) \leftarrow \mathit{reach}(\mathit{cf}_1, \mathit{cf}_2)$$

we define a symbolic interpretation as follows:

$$\Sigma_{RI}(\mathit{newp}(V)) = \Sigma(\mathit{reach}(\mathit{cf}_1, \mathit{cf}_2))$$

Moreover, $\Sigma_{RI}$ is identical to $\Sigma$ for the atoms with predicate
occurring in OpSem.

Now we have to prove that $\Sigma_{RI}$ is indeed an LA-solution of
$\mathit{OpSem}_{RI} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$. This proof is similar
to the proof of Theorem 5 (actually, simpler, because RI introduces new predicates
defined by single atoms, while LIN introduces new predicates defined by
conjunctions of atoms), and is omitted.

Vice versa, if $\Sigma_{RI}$ is an LA-solution of
$\mathit{OpSem}_{RI} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$, we construct an
LA-solution $\Sigma$ of $\mathit{OpSem} \cup \mathit{Aux} \cup F_{\mathit{pcorr}}$ by
defining

$$\Sigma(\mathit{reach}(\mathit{cf}_1, \mathit{cf}_2)) = \Sigma_{RI}(\mathit{newp}(V)). \qquad \Box$$

Proof of Theorem 4

Let LCls be a set of linear clauses and Gls be a set of nonlinear goals. We split the
proof of Theorem 4 in three parts:

Termination: The linearization transformation LIN terminates for the input set of
clauses LCls ∪ Gls;

Linearity: The output TransfCls of LIN is a set of linear clauses;

Equisatisfiability: LCls ∪ Gls is satisfiable iff TransfCls is satisfiable.

(Termination) Each Unfolding and Definition & Folding step terminates. Thus,
in order to prove the termination of LIN it is enough to show that the while loop
is executed a finite number of times, that is, a finite number of clauses are added
to NLCls. We will establish this finiteness property by showing that there exists an
integer $M$ such that every clause added to NLCls is of the form:

$$\mathit{newp}(X_1,\ldots,X_t) \leftarrow A_1,\ldots,A_k \qquad (\dagger 2)$$

where: (i) $k \leq M$, (ii) for $i = 1,\ldots,k$, $A_i$ is of the form
$p(X_1,\ldots,X_m)$, and (iii) $\{X_1,\ldots,X_t\} \subseteq \mathit{vars}(A_1,\ldots,A_k)$.
Since the predicate symbols occurring in LCls ∪ Gls are finitely many, there are
only finitely many clauses of this form modulo variable renaming.

Indeed, let $M$ be the maximal number of atoms occurring in the body of a goal
in Gls, to which NLCls is initialized. Now let us consider a clause $C$ in NLCls and
assume that in the body of $C$ there are at most $M$ atoms. The clauses in the set
LCls used for unfolding $C$ are linear, and hence in the body of each clause belonging
to the set $U(C)$ obtained after the Unfolding step, there are at most $M$ atoms.
Thus, each clause in $U(C)$ is of the form $H \leftarrow c, A_1,\ldots,A_k$, with
$k \leq M$. Since the body of every new clause introduced by the subsequent
Definition & Folding step is obtained by dropping the constraint from the body of a
clause in $U(C)$, we have that every clause added to NLCls is of the form
$(\dagger 2)$, with $k \leq M$. Thus, LIN terminates.

(Linearity) TransfCls is initialized to the set LCls of linear clauses. Moreover, each
clause added to TransfCls is of the form
$H \leftarrow c, \mathit{newp}(X_1,\ldots,X_t)$, and hence is linear.

(Equisatisfiability) In order to prove that LIN ensures equisatisfiability, let us
adapt to our context the basic notions about the unfold/fold transformation rules for
CLP programs presented in (Etalle and Gabbrielli 1996).

Besides the unfolding rule of Section 4.2 we also introduce the following definition
and folding rules.

Definition Rule. By definition we introduce a clause of the form
$\mathit{newp}(X) \leftarrow G$, where newp is a new predicate symbol and $X$ is a
tuple of variables occurring in $G$.

Folding Rule. Given a clause $E$: $H \leftarrow c, G$ and a clause $D$:
$\mathit{newp}(X) \leftarrow G$ introduced by the definition rule, suppose that
$X = \mathit{vars}(G) \cap \mathit{vars}(H, c)$. Then by folding $E$ using $D$ we derive
$H \leftarrow c, \mathit{newp}(X)$.

From a set Cls of clauses we can derive a new set TransfCls of clauses either by
adding a new clause to Cls using the definition rule or by: (i) selecting a clause C
in Cls, (ii) deriving a new set TransfC of clauses using one or more transformation
rules among unfolding and folding, and (iii) replacing C by TransfC in Cls. We can
apply a new sequence of transformation rules starting from TransfCls and iterate
this process at will.
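
The folding rule stated above can be sketched as follows. This is a simplified
illustration and not the VeriMAP implementation: atoms have only variables as
arguments, the constraint is kept as an opaque string together with its set of
variables, and the bodies of the two clauses are required to be syntactically
identical (no renaming is attempted). The names `Atom`, `Clause`, `atom_vars`
and `fold` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    pred: str
    args: tuple                  # variable names, e.g. ("X", "Y")

@dataclass(frozen=True)
class Clause:
    head: Atom
    constraint: str              # the constraint c, kept opaque
    constraint_vars: frozenset   # variables occurring in c
    body: tuple                  # the conjunction G, as a tuple of atoms

def atom_vars(atoms):
    """Set of variables occurring in a sequence of atoms."""
    return {v for a in atoms for v in a.args}

def fold(e, d):
    """Fold E: H <- c, G using D: newp(X) <- G, deriving H <- c, newp(X)."""
    if e.body != d.body:
        raise ValueError("the bodies differ, so folding is not applicable")
    # Side condition of the rule: X = vars(G) ∩ vars(H, c).
    shared = atom_vars(e.body) & (set(e.head.args) | e.constraint_vars)
    if set(d.head.args) != shared:
        raise ValueError("X must be equal to vars(G) ∩ vars(H, c)")
    return Clause(e.head, e.constraint, e.constraint_vars, (d.head,))

G = (Atom("p", ("X", "Y")), Atom("q", ("Y", "Z")))
E = Clause(Atom("h", ("X",)), "X >= 1", frozenset({"X"}), G)
D = Clause(Atom("newp", ("X",)), "true", frozenset(), G)
folded = fold(E, D)   # h(X) <- X >= 1, newp(X)
print(folded)
```

Note that the derived clause has a single atom in its body, which is how the
Definition & Folding step of LIN produces linear clauses.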

The following theorem is an immediate consequence of the correctness results for
the unfold/fold transformation rules of CLP programs (Etalle and Gabbrielli 1996).

Theorem 6 (Correctness of the Transformation Rules)

Let the set TransfCls be derived from Cls by a sequence of applications of the
unfolding, definition and folding transformation rules. Suppose that every clause
introduced by the definition rule is unfolded at least once in this sequence. Then,
Cls is satisfiable iff TransfCls is satisfiable.

Now, equisatisfiability easily follows from Theorem 6. Indeed, the Unfolding
and Definition & Folding steps of LIN are applications of the unfolding,
definition, and folding rules (strictly speaking, the rewriting performed after
unfolding is not included among the transformation rules, but obviously preserves
all LA-models). Moreover, every clause introduced during the Definition & Folding
step is added to NLCls and unfolded in a subsequent step of the transformation.
Thus, the hypotheses of Theorem 6 are fulfilled, and hence we have that LCls ∪ Gls
is satisfiable iff TransfCls is satisfiable. □

Linearized clauses for Fibonacci.

The set of linear constrained Horn clauses obtained after applying LIN is made out
of clauses E1, E2, E3, and C3, together with the following clauses:

new1(N1,U,V,U,N2,U,N3,U) :- N1=<0, N2=<0, N3=<0.

new1(N1,U,V,U,N2,U,N3,F3) :- N1=<0, N2=<0, N4=N3-1, W=U+V, N3>=1, new2(N4,W,U,F3).

new1(N1,U,V,U,N2,F2,N3,U) :- N1=<0, N4=N2-1, W=U+V, N2>=1, N3=<0, new2(N4,W,U,F2).

new1(N1,U,V,U,N2,F2,N3,F3) :- N1=<0, N4=N2-1, N2>=1, N5=N3-1, N3>=1,
    new3(N4,W,U,F2,N5,F3).

new1(N1,U,V,F1,N2,U,N3,U) :- N4=N1-1, W=U+V, N1>=1, N2=<0, N3=<0, new2(N4,W,U,F1).

new1(N1,U,V,F1,N2,U,N3,F3) :- N4=N1-1, N1>=1, N2=<0, N5=N3-1, W=U+V, N3>=1,
    new3(N4,W,U,F1,N5,F3).

new1(N1,U,V,F1,N2,F2,N3,U) :- N4=N1-1, N1>=1, N5=N2-1, W=U+V, N2>=1, N3=<0,
    new3(N4,W,U,F1,N5,F2).

new1(N1,U,V,F1,N2,F2,N3,F3) :- N4=N1-1, N1>=1, N5=N2-1, N2>=1, N6=N3-1, W=U+V,
    N3>=1, new1(N4,W,U,F1,N5,F2,N6,F3).

new2(N,U,V,U) :- N=<0.

new2(N,U,V,F) :- N2=N-1, W=U+V, N>=1, new2(N2,W,U,F).

new3(N1,U,V,U,N2,U) :- N1=<0, N2=<0.

new3(N1,U,V,U,N2,F2) :- N1=<0, N3=N2-1, W=U+V, N2>=1, new2(N3,W,U,F2).

new3(N1,U,V,F1,N2,F2) :- N3=N1-1, N1>=1, N4=N2-1, W=U+V, N2>=1,
    new3(N3,W,U,F1,N4,F2).

new3(N1,U,V,F1,N2,U) :- N3=N1-1, W=U+V, N1>=1, N2=<0, new2(N3,W,U,F1).
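
To make the role of the new predicates concrete, the clauses for new2 and new3
can be read as deterministic computations of their output arguments. The
following sketch is a direct transcription of those clauses into Python,
written for illustration and not produced by our tool; the function names
mirror the predicate names and the argument order `(n1, u, v, n2)` is a choice
of this example. It checks, on a small range of inputs, that new3 behaves as
the conjunction of two new2 atoms sharing U and V, which is what the definition
clauses C4 and C5 suggest.

```python
def new2(n, u, v):
    """Output F of new2(N,U,V,F).

    new2(N,U,V,U) :- N=<0.
    new2(N,U,V,F) :- N2=N-1, W=U+V, N>=1, new2(N2,W,U,F).
    """
    while n >= 1:
        n, u, v = n - 1, u + v, u
    return u

def new3(n1, u, v, n2):
    """Outputs (F1, F2) of new3(N1,U,V,F1,N2,F2), following its four clauses."""
    while True:
        if n1 <= 0 and n2 <= 0:
            return u, u
        if n1 <= 0:                      # here n2 >= 1
            return u, new2(n2 - 1, u + v, u)
        if n2 <= 0:                      # here n1 >= 1
            return new2(n1 - 1, u + v, u), u
        # both n1 >= 1 and n2 >= 1: the linear recursive clause
        n1, n2, u, v = n1 - 1, n2 - 1, u + v, u

# With U = 1 and V = 0 the outputs of new2 are Fibonacci numbers.
print([new2(n, 1, 0) for n in range(6)])  # [1, 1, 2, 3, 5, 8]

# new3 agrees with a pair of independent new2 computations.
agree = all(
    new3(n1, u, v, n2) == (new2(n1, u, v), new2(n2, u, v))
    for n1 in range(-1, 8) for n2 in range(-1, 8)
    for u in range(0, 3) for v in range(0, 3)
)
print(agree)
```

The single recursive call in `new3` decrements both counters at once. This is
the effect of linearization: one atom in the body replaces a conjunction of two.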
Proof of Theorem 5 (Monotonicity with respect to LA-Solvability).

Suppose that the set LCls ∪ Gls of constrained Horn clauses is LA-solvable, and let
TransfCls be obtained by applying LIN to LCls ∪ Gls. Let $\Sigma$ be an LA-solution
of LCls ∪ Gls. We now construct an LA-solution of TransfCls. For any predicate newp
introduced by LIN via a clause of the form:

$$\mathit{newp}(X_1,\ldots,X_t) \leftarrow A_1,\ldots,A_k$$

we define a symbolic interpretation $\Sigma'$ as follows:

$$\Sigma'(\mathit{newp}(X_1,\ldots,X_t)) = \Sigma(A_1) \wedge \ldots \wedge \Sigma(A_k)$$

Now, we are left with the task of proving that $\Sigma'$ is indeed an LA-solution of
TransfCls. The clauses in TransfCls are either of the form

$$\mathit{false} \leftarrow c,\ \mathit{newq}(X_1,\ldots,X_u)$$

or of the form

$$\mathit{newp}(X_1,\ldots,X_t) \leftarrow c,\ \mathit{newq}(X_1,\ldots,X_u)$$

where newp and newq are predicates introduced by LIN. We will only consider the
more difficult case where the conclusion is not false.

The clause $\mathit{newp}(X_1,\ldots,X_t) \leftarrow c,\ \mathit{newq}(X_1,\ldots,X_u)$
has been derived (see the linearization transformation LIN in Figure 2) in the
following two steps.

(Step i) Unfolding $\mathit{newp}(X_1,\ldots,X_t) \leftarrow A_1,\ldots,A_k$ w.r.t. all
atoms in its body using $k$ clauses in LCls:

$$A_1 \leftarrow c_1, B_1 \quad \ldots \quad A_k \leftarrow c_k, B_k$$

where some of the $B_i$'s can be true and $c = c_1,\ldots,c_k$, thereby deriving

$$\mathit{newp}(X_1,\ldots,X_t) \leftarrow c_1,\ldots,c_k, B_1,\ldots,B_k$$

(Without loss of generality we assume that the atoms in the body of the clauses
are equal to, instead of unifiable with, the heads of the clauses in LCls.)

(Step ii) Folding $\mathit{newp}(X_1,\ldots,X_t) \leftarrow c_1,\ldots,c_k, B_1,\ldots,B_k$
using a clause of the form:

$$\mathit{newq}(X_1,\ldots,X_u) \leftarrow B_1,\ldots,B_k$$

Thus, for $\mathit{newq}(X_1,\ldots,X_u)$ we have the following symbolic
interpretation:

$$\Sigma'(\mathit{newq}(X_1,\ldots,X_u)) = \Sigma(B_1) \wedge \ldots \wedge \Sigma(B_k)$$

To prove that $\Sigma'$ is an LA-solution of TransfCls, we have to show that

$$LA \models \forall\,(c \wedge \Sigma'(\mathit{newq}(X_1,\ldots,X_u)) \rightarrow \Sigma'(\mathit{newp}(X_1,\ldots,X_t)))$$

Assume that

$$LA \models c \wedge \Sigma'(\mathit{newq}(X_1,\ldots,X_u))$$

Then, by definition of $\Sigma'$,

$$LA \models c \wedge \Sigma(B_1) \wedge \ldots \wedge \Sigma(B_k)$$

Since $\Sigma$ is an LA-solution of LCls, we have that:

$$LA \models \forall\,(c_1 \wedge \Sigma(B_1) \rightarrow \Sigma(A_1)) \quad \ldots \quad LA \models \forall\,(c_k \wedge \Sigma(B_k) \rightarrow \Sigma(A_k))$$

and hence

$$LA \models \Sigma(A_1) \wedge \ldots \wedge \Sigma(A_k)$$

Thus, by definition of $\Sigma'$,

$$LA \models \Sigma'(\mathit{newp}(X_1,\ldots,X_t)). \qquad \Box$$
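
The construction of the symbolic interpretation for the new predicates can be
illustrated on a small instance. The clauses for `p`, the assumed LA-solution
`sigma_p`, and the definition of `newp` as a conjunction of two `p` atoms are
all hypothetical and chosen only for this sketch. The check below evaluates the
implication on a bounded grid of integers, so it is a sanity test of the
construction and not a proof of validity in LA.

```python
from itertools import product

# Hypothetical linear clauses LCls:
#   p(X,Y) :- X = 0, Y = 0.
#   p(X,Y) :- X = X1 + 1, Y = Y1 + 2, p(X1,Y1).
# Hypothetical definition introduced by linearization:
#   newp(X1,Y1,X2,Y2) :- p(X1,Y1), p(X2,Y2).
# Linear clause obtained by unfolding both atoms and then folding:
#   newp(X1,Y1,X2,Y2) :- X1 = A+1, Y1 = B+2, X2 = C+1, Y2 = D+2, newp(A,B,C,D).

def sigma_p(x, y):
    """Assumed LA-solution for p: Y = 2X and X >= 0."""
    return y == 2 * x and x >= 0

def sigma_newp(x1, y1, x2, y2):
    """Interpretation of newp: the conjunction of the interpretations of its body atoms."""
    return sigma_p(x1, y1) and sigma_p(x2, y2)

def derived_clause_holds(lo, hi):
    """Check c and sigma'(newp(A,B,C,D)) -> sigma'(newp(A+1,B+2,C+1,D+2)) on a grid."""
    for a, b, c, d in product(range(lo, hi), repeat=4):
        if sigma_newp(a, b, c, d) and not sigma_newp(a + 1, b + 2, c + 1, d + 2):
            return False
    return True

print(sigma_p(0, 0), derived_clause_holds(-3, 7))
```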